Goto

Collaborating Authors

 McLean



How one controversial startup hopes to cool the planet

MIT Technology Review

And why many scientists are freaked out about the first serious for-profit company moving into the solar geoengineering field. Stardust Solutions believes that it can solve climate change--for a price. The Israel-based geoengineering startup has said it expects nations will soon pay it more than a billion dollars a year to launch specially equipped aircraft into the stratosphere. Once they've reached the necessary altitude, those planes will disperse particles engineered to reflect away enough sunlight to cool down the planet, purportedly without causing environmental side effects. The proprietary (and still secret) particles could counteract all the greenhouse gases the world has emitted over the last 150 years, the company stated in a 2023 pitch deck it presented to venture capital firms. In fact, it's the "only technologically feasible solution" to climate change, the company said. The company disclosed it raised $60 million in funding in October, marking by far the largest known funding round to date for a startup working on solar geoengineering.



Quick-Draw Bandits: Quickly Optimizing in Nonstationary Environments with Extremely Many Arms

arXiv.org Machine Learning

Canonical algorithms for multi-armed bandits typically assume a stationary reward environment where the size of the action space (number of arms) is small. More recently developed methods typically relax only one of these assumptions: existing non-stationary bandit policies are designed for a small number of arms, while Lipschitz, linear, and Gaussian process bandit policies are designed to handle a large (or infinite) number of arms in stationary reward environments under constraints on the reward function. In this manuscript, we propose a novel policy to learn reward environments over a continuous space using Gaussian interpolation. We show that our method efficiently learns continuous Lipschitz reward functions with $\mathcal{O}^*(\sqrt{T})$ cumulative regret. Furthermore, our method naturally extends to non-stationary problems with a simple modification. We finally demonstrate that our method is computationally favorable (100-10000x faster) and experimentally outperforms sliding Gaussian process policies on datasets with non-stationarity and an extremely large number of arms.


Early Prediction of Alzheimer's and Related Dementias: A Machine Learning Approach Utilizing Social Determinants of Health Data

arXiv.org Artificial Intelligence

Alzheimer's disease and related dementias (AD/ADRD) represent a growing healthcare crisis affecting over 6 million Americans. While genetic factors play a crucial role, emerging research reveals that social determinants of health (SDOH) significantly influence both the risk and progression of cognitive functioning, such as cognitive scores and cognitive decline. This report examines how these social, environmental, and structural factors impact cognitive health trajectories, with a particular focus on Hispanic populations, who face disproportionate risk for AD/ADRD. Using data from the Mexican Health and Aging Study (MHAS) and its cognitive assessment sub study (Mex-Cog), we employed ensemble of regression trees models to predict 4-year and 9-year cognitive scores and cognitive decline based on SDOH. This approach identified key predictive SDOH factors to inform potential multilevel interventions to address cognitive health disparities in this population. Introduction Alzheimer's disease and related dementias (AD/ADRD) pose an escalating medical and public health challenge, currently affecting over 6 million Americans.


SEOE: A Scalable and Reliable Semantic Evaluation Framework for Open Domain Event Detection

arXiv.org Artificial Intelligence

Automatic evaluation for Open Domain Event Detection (ODED) is a highly challenging task, because ODED is characterized by a vast diversity of un-constrained output labels from various domains. Nearly all existing evaluation methods for ODED usually first construct evaluation benchmarks with limited labels and domain coverage, and then evaluate ODED methods using metrics based on token-level label matching rules. However, this kind of evaluation framework faces two issues: (1) The limited evaluation benchmarks lack representatives of the real world, making it difficult to accurately reflect the performance of various ODED methods in real-world scenarios; (2) Evaluation metrics based on token-level matching rules fail to capture semantic similarity between predictions and golden labels. To address these two problems above, we propose a scalable and reliable Semantic-level Evaluation framework for Open domain Event detection (SEOE) by constructing a more representative evaluation benchmark and introducing a semantic evaluation metric. Specifically, our proposed framework first constructs a scalable evaluation benchmark that currently includes 564 event types covering 7 major domains, with a cost-effective supplementary annotation strategy to ensure the benchmark's representativeness. The strategy also allows for the supplement of new event types and domains in the future. Then, the proposed SEOE leverages large language models (LLMs) as automatic evaluation agents to compute a semantic F1-score, incorporating fine-grained definitions of semantically similar labels to enhance the reliability of the evaluation. Extensive experiments validate the representatives of the benchmark and the reliability of the semantic evaluation metric. Existing ODED methods are thoroughly evaluated, and the error patterns of predictions are analyzed, revealing several insightful findings.


Technique Inference Engine: A Recommender Model to Support Cyber Threat Hunting

arXiv.org Artificial Intelligence

Cyber threat hunting is the practice of proactively searching for latent threats in a network. Engaging in threat hunting can be difficult due to the volume of network traffic, variety of adversary techniques, and constantly evolving vulnerabilities. To aid analysts in identifying techniques which may be co-occurring as part of a campaign, we present the Technique Inference Engine, a tool to infer tactics, techniques, and procedures (TTPs) which may be related to existing observations of adversarial behavior. We compile the largest (to our knowledge) available dataset of cyber threat intelligence (CTI) reports labeled with relevant TTPs. With the knowledge that techniques are chronically under-reported in CTI, we apply several implicit feedback recommender models to the data in order to predict additional techniques which may be part of a given campaign. We evaluate the results in the context of the cyber analyst's use case and apply t-SNE to visualize the model embeddings. We provide our code and a web interface.


FIG: Forward-Inverse Generation for Low-Resource Domain-specific Event Detection

arXiv.org Artificial Intelligence

Event Detection (ED) is the task of identifying typed event mentions of interest from natural language text, which benefits domain-specific reasoning in biomedical, legal, and epidemiological domains. However, procuring supervised data for thousands of events for various domains is a laborious and expensive task. To this end, existing works have explored synthetic data generation via forward (generating labels for unlabeled sentences) and inverse (generating sentences from generated labels) generations. However, forward generation often produces noisy labels, while inverse generation struggles with domain drift and incomplete event annotations. To address these challenges, we introduce FIG, a hybrid approach that leverages inverse generation for high-quality data synthesis while anchoring it to domain-specific cues extracted via forward generation on unlabeled target data. FIG further enhances its synthetic data by adding missing annotations through forward generation-based refinement. Experimentation on three ED datasets from diverse domains reveals that FIG outperforms the best baseline achieving average gains of 3.3% F1 and 5.4% F1 in the zero-shot and few-shot settings respectively. Analyzing the generated trigger hit rate and human evaluation substantiates FIG's superior domain alignment and data quality compared to existing baselines.


Multi-Agent Coordination across Diverse Applications: A Survey

arXiv.org Artificial Intelligence

Multi-agent coordination studies the underlying mechanism enabling the trending spread of diverse multi-agent systems (MAS) and has received increasing attention, driven by the expansion of emerging applications and rapid AI advances. This survey outlines the current state of coordination research across applications through a unified understanding that answers four fundamental coordination questions: (1) what is coordination; (2) why coordination; (3) who to coordinate with; and (4) how to coordinate. Our purpose is to explore existing ideas and expertise in coordination and their connections across diverse applications, while identifying and highlighting emerging and promising research directions. First, general coordination problems that are essential to varied applications are identified and analyzed. Second, a number of MAS applications are surveyed, ranging from widely studied domains, e.g., search and rescue, warehouse automation and logistics, and transportation systems, to emerging fields including humanoid and anthropomorphic robots, satellite systems, and large language models (LLMs). Finally, open challenges about the scalability, heterogeneity, and learning mechanisms of MAS are analyzed and discussed. In particular, we identify the hybridization of hierarchical and decentralized coordination, human-MAS coordination, and LLM-based MAS as promising future directions.


OCCULT: Evaluating Large Language Models for Offensive Cyber Operation Capabilities

arXiv.org Artificial Intelligence

The prospect of artificial intelligence (AI) competing in the adversarial landscape of cyber security has long been considered one of the most impactful, challenging, and potentially dangerous applications of AI. Here, we demonstrate a new approach to assessing AI's progress towards enabling and scaling real-world offensive cyber operations (OCO) tactics in use by modern threat actors. We detail OCCULT, a lightweight operational evaluation framework that allows cyber security experts to contribute to rigorous and repeatable measurement of the plausible cyber security risks associated with any given large language model (LLM) or AI employed for OCO. We also prototype and evaluate three very different OCO benchmarks for LLMs that demonstrate our approach and serve as examples for building benchmarks under the OCCULT framework. Finally, we provide preliminary evaluation results to demonstrate how this framework allows us to move beyond traditional all-or-nothing tests, such as those crafted from educational exercises like capture-the-flag environments, to contextualize our indicators and warnings in true cyber threat scenarios that present risks to modern infrastructure. We find that there has been significant recent advancement in the risks of AI being used to scale realistic cyber threats. For the first time, we find a model (DeepSeek-R1) is capable of correctly answering over 90% of challenging offensive cyber knowledge tests in our Threat Actor Competency Test for LLMs (TACTL) multiple-choice benchmarks. We also show how Meta's Llama and Mistral's Mixtral model families show marked performance improvements over earlier models against our benchmarks where LLMs act as offensive agents in MITRE's high-fidelity offensive and defensive cyber operations simulation environment, CyberLayer.